Collect spatial data¶
Created by
¶ Li, Chaonan (李超男) / licn@mtc.edu.cn / Ecological Security and Protection Key Laboratory of Sichuan Province, Mianyang Normal University (绵阳师范学院生态安全与保护四川省重点实验室)
¶ Liu Chi (刘驰) / liuchi0426@126.com / College of Resources and Environment, Fujian Agriculture and Forestry University (福建农林大学资源与环境学院)
¶ Liao, Haijun (廖海君) / lihj@mtc.edu.cn / Engineering Research Center of Chuanxibei RHS Construction at Mianyang Normal University of Sichuan Province (绵阳师范学院川西北乡村人居环境建设工程研究中心)
Reviewed by
¶ Li, Xiangzhen (李香真) / lixz@fafu.edu.cn / College of Resources and Environment, Fujian Agriculture and Forestry University (福建农林大学资源与环境学院)
Analyzing the spatial patterns of microbial traits also poses a significant challenge due to the limited availability of environmental data that correspond to microbiome data, and this dearth of paired datasets is particularly prevalent in the majority of currently accessible public repositories. The geodata R package provides some method to download spatial, climate and soil properties from public repositories, and the MODIS hosts many remote-sense images of plant metrics and land cover. To a certain extent, this mitigates the difficult of limited environmental data, though the accuracy of their absolute values is relatively lower compared to those measured ones.
Yet, it is extremely challenging for most R language beginners to effectively integrate these datasets for microbial biogeography analysis, as the whole process involves manipulating diverse datasets. To solve this problem, we have designed a series of functions to download various spatial, climate, vegetation, and soil parameters from public repositories. Although the accuracy of these datasets are lower than that of measured ones, they can still be associated with microbial biogeographical features at a larger spatial scale.
Please note that all spatial data would be resampled to a same spatial resolution based on the first downloaded SpatRaster!
Now, let's go through each of these functions and see how to download environmental properties from public repositories.
Load required R packages¶
# Load three required packages
suppressMessages(require("magrittr"))
require("ggplot2") %>% suppressMessages()
require("microgeo") %>% suppressMessages()
Create a standard microgeo dataset¶
We need a standard microgeo dataset for the presentations in the section of tutorial.
# Example by using the map downloaded from the DataV.GeoAtlas
data(qtp)
map <- read_aliyun_map(adcode = c(540000, 630000, 510000)) %>% suppressMessages()
dataset.dts.aliyun <- create_dataset(mat = qtp$asv, ant = qtp$tax, met = qtp$met, map = map,
phy = qtp$tre, env = qtp$env, lon = "longitude", lat = "latitude")
dataset.dts.aliyun %>% show_dataset()
ℹ [2024-01-04 17:10:15.267327] INFO ==> all samples fall within the map area! ℹ [2024-01-04 17:10:15.276641] INFO ==> dataset has been created successfully! ℹ [2024-01-04 17:10:15.287247] INFO ==> use `object %>% show_dataset()` to check the summary of dataset.
── The Summary of Microgeo Dataset ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
! object$mat: 6808 ASVs/genes and 1244 samples [need to be subsampled!] ℹ object$ant: 6808 ASVs/genes and 7 annotation levels (Kingdom, Phylum, Class, Order, Family, Genus, Species) ℹ object$met: 1244 samples and 2 variables (longitude, latitude) ℹ object$map: a SpatialPolygonsDataFrame with the CRS of '+proj=longlat +datum=WGS84 +no_defs' ℹ object$phy: a phylogenetic tree with 6808 tip labels ℹ object$env: 1244 samples and 10 variables
• To check the summary of dataset, Replace `object` with the variable name of your dataset • For example, if the variable name is `dataset.dts`you can run `head(dataset.dts$met)` to check the content of `met`
Collect the aridity index¶
The get_ai() is implemented to download aridity index from global aridity index database. The resolution of original data is 30''. Here is a simple example.
# Download aridity index for research area
dataset.dts.aliyun %<>% get_ai(out.dir = "../dev/dat/rundata")
✔ [2024-01-04 17:10:20.571551] SAVE ==> results have been saved to: object$spa$rast$his$AI
# Show dataset
dataset.dts.aliyun %>% show_dataset()
── The Summary of Microgeo Dataset ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
! object$mat: 6808 ASVs/genes and 1244 samples [need to be subsampled!] ℹ object$ant: 6808 ASVs/genes and 7 annotation levels (Kingdom, Phylum, Class, Order, Family, Genus, Species) ℹ object$met: 1244 samples and 2 variables (longitude, latitude) ℹ object$map: a SpatialPolygonsDataFrame with the CRS of '+proj=longlat +datum=WGS84 +no_defs' ℹ object$phy: a phylogenetic tree with 6808 tip labels ℹ object$env: 1244 samples and 10 variables
── The Summary of Biogeographic Traits ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
✔ object$spa: 1 historically numeric variables; 0 historically classification variables; 0 groups of future climate data
• To check the summary of dataset, Replace `object` with the variable name of your dataset • For example, if the variable name is `dataset.dts`you can run `head(dataset.dts$met)` to check the content of `met`
# Visualize the aridity index
options(repr.plot.width = 13.43 * 2, repr.plot.height = 7.9)
q1 <- plot_bmap(map = dataset.dts.aliyun$map) %>%
add_spatraster(spat.raster = dataset.dts.aliyun$spa$rast$his$AI,
color = colorRampPalette(RColorBrewer::brewer.pal(11, "RdYlGn"))(100)) %>%
add_north_arrow() %>% add_scale_bar() %>% add_crs()
q2 <- plot_bmap(map = dataset.dts.aliyun$map) %>%
add_spatraster(spat.raster = dataset.dts.aliyun$spa$rast$his$AI, breaks = c(0.03, 0.2, 0.5, 0.65),
color = RColorBrewer::brewer.pal(11, "RdYlGn")[c(1,3,5,9,11)], labels = c("HAR", "AR", "SER", "SHR", "HR")) %>%
add_north_arrow() %>% add_scale_bar() %>% add_crs()
cowplot::plot_grid(q1, q2, align = 'hv', ncol = 2, labels = c("a", "b"), label_size = 28) %>% suppressWarnings()
Collect the elevation¶
The get_elev() is implemented to download elevation data from WorldClim database version 2.1. You are allowed to specify a spatial resolution. Here is a simple example based on the resolution of 2.5'.
# Download elevation for research area
dataset.dts.aliyun %<>% get_elev(res = 2.5, out.dir = "../dev/dat/rundata")
✔ [2024-01-04 17:10:30.039436] SAVE ==> results have been saved to: object$spa$rast$his$ELEV
# Show dataset
dataset.dts.aliyun %>% show_dataset()
── The Summary of Microgeo Dataset ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
! object$mat: 6808 ASVs/genes and 1244 samples [need to be subsampled!] ℹ object$ant: 6808 ASVs/genes and 7 annotation levels (Kingdom, Phylum, Class, Order, Family, Genus, Species) ℹ object$met: 1244 samples and 2 variables (longitude, latitude) ℹ object$map: a SpatialPolygonsDataFrame with the CRS of '+proj=longlat +datum=WGS84 +no_defs' ℹ object$phy: a phylogenetic tree with 6808 tip labels ℹ object$env: 1244 samples and 10 variables
── The Summary of Biogeographic Traits ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
✔ object$spa: 2 historically numeric variables; 0 historically classification variables; 0 groups of future climate data
• To check the summary of dataset, Replace `object` with the variable name of your dataset • For example, if the variable name is `dataset.dts`you can run `head(dataset.dts$met)` to check the content of `met`
# Visualize the elevation
options(repr.plot.width = 13.43 * 2, repr.plot.height = 7.9)
q3 <- plot_bmap(map = dataset.dts.aliyun$map) %>%
add_spatraster(spat.raster = dataset.dts.aliyun$spa$rast$his$ELEV) %>%
add_north_arrow() %>% add_scale_bar() %>% add_crs()
q4 <- plot_bmap(map = dataset.dts.aliyun$map) %>%
add_spatraster(spat.raster = dataset.dts.aliyun$spa$rast$his$ELEV, breaks = c(3000, 4000, 5000, 6000),
labels = c("<3000", "3000-4000", "4000-5000", "5000-6000", ">6000")) %>%
add_north_arrow() %>% add_scale_bar() %>% add_crs()
cowplot::plot_grid(q3, q4, align = 'hv', ncol = 2, labels = c("a", "b"), label_size = 28) %>% suppressWarnings()
Collect the historical bioclimatic variables¶
The get_his_bioc() is implemented to download 19 historically bioclimatic variables from WorldClim database version 2.1. You are allowed to specify a spatial resolution. Here is a simple example based on the resolution of 2.5'.
# Download 19 historically bioclimatic variables of research area
dataset.dts.aliyun %<>% get_his_bioc(res = 2.5, out.dir = "../dev/dat/rundata")
✔ [2024-01-04 17:10:52.243656] SAVE ==> results have been saved to: object$spa$rast$his(19 variables)
# Show dataset
dataset.dts.aliyun %>% show_dataset()
── The Summary of Microgeo Dataset ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
! object$mat: 6808 ASVs/genes and 1244 samples [need to be subsampled!] ℹ object$ant: 6808 ASVs/genes and 7 annotation levels (Kingdom, Phylum, Class, Order, Family, Genus, Species) ℹ object$met: 1244 samples and 2 variables (longitude, latitude) ℹ object$map: a SpatialPolygonsDataFrame with the CRS of '+proj=longlat +datum=WGS84 +no_defs' ℹ object$phy: a phylogenetic tree with 6808 tip labels ℹ object$env: 1244 samples and 10 variables
── The Summary of Biogeographic Traits ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
✔ object$spa: 21 historically numeric variables; 0 historically classification variables; 0 groups of future climate data
• To check the summary of dataset, Replace `object` with the variable name of your dataset • For example, if the variable name is `dataset.dts`you can run `head(dataset.dts$met)` to check the content of `met`
# Visualize the Bio1 (Mean annual temperature)
options(repr.plot.width = 13.43 * 2, repr.plot.height = 7.9)
q5 <- plot_bmap(map = dataset.dts.aliyun$map) %>%
add_spatraster(spat.raster = dataset.dts.aliyun$spa$rast$his$Bio1) %>%
add_north_arrow() %>% add_scale_bar() %>% add_crs()
q6 <- plot_bmap(map = dataset.dts.aliyun$map) %>%
add_spatraster(spat.raster = dataset.dts.aliyun$spa$rast$his$Bio1, breaks = c(-1, 0, 1), labels = c("A", "B", "C", "D")) %>%
add_north_arrow() %>% add_scale_bar() %>% add_crs()
cowplot::plot_grid(q5, q6, align = 'hv', ncol = 2, labels = c("a", "b"), label_size = 28) %>% suppressWarnings()
# Visualize the Bio12 (Mean annual precipitation)
options(repr.plot.width = 13.43 * 2, repr.plot.height = 7.9)
q7 <- plot_bmap(map = dataset.dts.aliyun$map) %>%
add_spatraster(spat.raster = dataset.dts.aliyun$spa$rast$his$Bio12) %>%
add_north_arrow() %>% add_scale_bar() %>% add_crs()
q8 <- plot_bmap(map = dataset.dts.aliyun$map) %>%
add_spatraster(spat.raster = dataset.dts.aliyun$spa$rast$his$Bio12, breaks = c(100, 300, 500), labels = c("<100", "100-300", "300-500", ">500")) %>%
add_north_arrow() %>% add_scale_bar() %>% add_crs()
cowplot::plot_grid(q7, q8, align = 'hv', ncol = 2, labels = c("a", "b"), label_size = 28) %>% suppressWarnings()
Collect the futrue bioclimatic variables¶
The get_fut_bioc() is implemented to download 19 future bioclimatic variables from WorldClim database version 2.1. You are allowed to specify a spatial resolution. Here is a simple example based on the resolution of 2.5'.
# Download futrue bioclimatic variables of research area
dataset.dts.aliyun %<>% get_fut_bioc(res = 2.5, out.dir = "../dev/dat/rundata")
✔ [2024-01-04 17:11:41.465586] SAVE ==> results have been saved to: object$spa$rast$fut [19 variables; 4 groups]
# Show dataset
dataset.dts.aliyun %>% show_dataset()
── The Summary of Microgeo Dataset ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
! object$mat: 6808 ASVs/genes and 1244 samples [need to be subsampled!] ℹ object$ant: 6808 ASVs/genes and 7 annotation levels (Kingdom, Phylum, Class, Order, Family, Genus, Species) ℹ object$met: 1244 samples and 2 variables (longitude, latitude) ℹ object$map: a SpatialPolygonsDataFrame with the CRS of '+proj=longlat +datum=WGS84 +no_defs' ℹ object$phy: a phylogenetic tree with 6808 tip labels ℹ object$env: 1244 samples and 10 variables
── The Summary of Biogeographic Traits ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
✔ object$spa: 21 historically numeric variables; 0 historically classification variables; 4 groups of future climate data
• To check the summary of dataset, Replace `object` with the variable name of your dataset • For example, if the variable name is `dataset.dts`you can run `head(dataset.dts$met)` to check the content of `met`
# Visualize the Bio1 (Mean annual temperature) of `BCC-CSM2-MR|ssp585|2061-2080`
options(repr.plot.width = 13.43 * 2, repr.plot.height = 7.9)
q9 <- plot_bmap(map = dataset.dts.aliyun$map) %>%
add_spatraster(spat.raster = dataset.dts.aliyun$spa$rast$fut$`BCC-CSM2-MR|ssp585|2061-2080`$Bio1) %>%
add_north_arrow() %>% add_scale_bar() %>% add_crs()
q10 <- plot_bmap(map = dataset.dts.aliyun$map) %>%
add_spatraster(spat.raster = dataset.dts.aliyun$spa$rast$fut$`BCC-CSM2-MR|ssp585|2061-2080`$Bio1, breaks = c(-1, 0, 1), labels = c("A", "B", "C", "D")) %>%
add_north_arrow() %>% add_scale_bar() %>% add_crs()
cowplot::plot_grid(q9, q10, align = 'hv', ncol = 2, labels = c("a", "b"), label_size = 28) %>% suppressWarnings()
# Visualize the Bio12 (Mean annual precipitation) of `BCC-CSM2-MR|ssp585|2061-2080`
options(repr.plot.width = 13.43 * 2, repr.plot.height = 7.9)
q11 <- plot_bmap(map = dataset.dts.aliyun$map) %>%
add_spatraster(spat.raster = dataset.dts.aliyun$spa$rast$fut$`BCC-CSM2-MR|ssp585|2061-2080`$Bio12) %>%
add_north_arrow() %>% add_scale_bar() %>% add_crs()
q12 <- plot_bmap(map = dataset.dts.aliyun$map) %>%
add_spatraster(spat.raster = dataset.dts.aliyun$spa$rast$fut$`BCC-CSM2-MR|ssp585|2061-2080`$Bio12, breaks = c(100, 300, 500), labels = c("<100", "100-300", "300-500", ">500")) %>%
add_north_arrow() %>% add_scale_bar() %>% add_crs()
cowplot::plot_grid(q11, q12, align = 'hv', ncol = 2, labels = c("a", "b"), label_size = 28) %>% suppressWarnings()
Collect the numeric metrics from MODIS¶
# Show all avaliable numeric MODIS metrics
show_modis_num_metrics()
| measure | resolution | type | name | sds | unit | scale.factor | |
|---|---|---|---|---|---|---|---|
| <chr> | <chr> | <chr> | <chr> | <chr> | <chr> | <chr> | |
| 1 | NDVI | 250 | Terra | MOD13Q1 | 250m 16 days NDVI | no unit | 1e-04 |
| 2 | NDVI | 500 | Terra | MOD13A1 | 500m 16 days NDVI | no unit | 1e-04 |
| 3 | NDVI | 1000 | Terra | MOD13A2 | 1 km 16 days NDVI | no unit | 1e-04 |
| 4 | NDVI | 250 | Aqua | MYD13Q1 | 250m 16 days NDVI | no unit | 1e-04 |
| 5 | NDVI | 500 | Aqua | MYD13A1 | 500m 16 days NDVI | no unit | 1e-04 |
| 6 | NDVI | 1000 | Aqua | MYD13A2 | 1 km 16 days NDVI | no unit | 1e-04 |
| 7 | EVI | 250 | Terra | MOD13Q1 | 250m 16 days EVI | no unit | 1e-04 |
| 8 | EVI | 500 | Terra | MOD13A1 | 500m 16 days EVI | no unit | 1e-04 |
| 9 | EVI | 1000 | Terra | MOD13A2 | 1 km 16 days EVI | no unit | 1e-04 |
| 10 | EVI | 250 | Aqua | MYD13Q1 | 250m 16 days EVI | no unit | 1e-04 |
| 11 | EVI | 500 | Aqua | MYD13A1 | 500m 16 days EVI | no unit | 1e-04 |
| 12 | EVI | 1000 | Aqua | MYD13A2 | 1 km 16 days EVI | no unit | 1e-04 |
| 13 | GPP | 500 | Terra | MOD17A2HGF | Gpp_500m | kg C/m^2 | 1e-04 |
| 14 | GPP | 500 | Aqua | MYD17A2H | Gpp_500m | kg C/m^2 | 1e-04 |
| 15 | PsnNet | 500 | Terra | MOD17A2HGF | PsnNet_500m | kg C/m^2 | 1e-04 |
| 16 | PsnNet | 500 | Aqua | MYD17A2H | PsnNet_500m | kg C/m^2 | 1e-04 |
| 17 | NPP | 500 | Terra | MOD17A3HGF | Npp_500m | kg C/m^2 | 1e-04 |
| 18 | NPP | 500 | Aqua | MYD17A3HGF | Npp_500m | kg C/m^2 | 1e-04 |
| 19 | LST | 1000 | Terra | MOD11A1 | LST_Day_1km | Kelvin | 0.02 |
| 20 | LST | 1000 | Aqua | MYD11A1 | LST_Day_1km | Kelvin | 0.02 |
| 21 | ET | 500 | Terra | MOD16A2 | ET_500m | kg/m²/8day | 0.1 |
| 22 | ET | 500 | Aqua | MYD16A2 | ET_500m | kg/m²/8day | 0.1 |
| 23 | PET | 500 | Terra | MOD16A2 | PET_500m | kg/m²/8day | 0.1 |
| 24 | PET | 500 | Aqua | MYD16A2 | PET_500m | kg/m²/8day | 0.1 |
| 25 | LE | 500 | Terra | MOD16A2 | LE_500m | J/m²/day | 10000 |
| 26 | LE | 500 | Aqua | MYD16A2 | LE_500m | J/m²/day | 10000 |
| 27 | PLE | 500 | Terra | MOD16A2 | PLE_500m | J/m²/day | 10000 |
| 28 | PLE | 500 | Aqua | MYD16A2 | PLE_500m | J/m²/day | 10000 |
# Download numeric metrics from MODIS
# Please replace <username> and <passwd> with real value
dataset.dts.aliyun %<>% get_modis_num_metrics(username = "username", password = "passwd", date.ran = c("2019-08-01|2019-09-01", "2020-08-01|2020-09-01"),
measures = c("NDVI", "EVI"), out.dir = "../dev/dat/rundata")
ℹ [2024-01-04 17:11:55.234416] INFO ==> preparing MODIS product list for searching... ℹ [2024-01-04 17:11:55.243476] INFO ==> searching avaliable MODIS products... ℹ [2024-01-04 17:11:55.25067] INFO ==> current product (1/2): MOD13A2 (NDVI|EVI--> 2019-08-01 to 2019-09-01) ℹ [2024-01-04 17:11:57.686182] INFO ==> current product (2/2): MOD13A2 (NDVI|EVI--> 2020-08-01 to 2020-09-01) ℹ [2024-01-04 17:12:01.529056] INFO ==> find 48 files with 0.97 GB in total... ℹ [2024-01-04 17:12:01.546177] INFO ==> downloading all avaliable MODIS products[skip if the file exists]... ℹ [2024-01-04 17:12:01.561643] INFO ==> preparing the PTVs (Product, Time, Version) for merging remote-sensing images... ℹ [2024-01-04 17:12:01.653311] INFO ==> converting hdf files to tif files... ℹ [2024-01-04 17:12:01.662253] INFO ==> current product (1/1): MOD13A2 (convert 48 hdf files into 12 tif files using 12 threads) ℹ [2024-01-04 17:12:01.932387] INFO ==> calculating average values based on date range... ℹ [2024-01-04 17:12:01.948657] INFO ==> current measure (1/2): NDVI_061(30 threads) ℹ [2024-01-04 17:12:01.985294] INFO ==> current measure (2/2): EVI_061(30 threads) ℹ [2024-01-04 17:12:02.00275] INFO ==> reprojecting the CRS of SpatRaster to epsg:+proj=longlat +datum=WGS84 +no_defs, it takes a while... ✔ [2024-01-04 17:12:15.695229] SAVE ==> results have been saved to: object$spa$rast$his$NDVI ℹ [2024-01-04 17:12:15.721029] INFO ==> reprojecting the CRS of SpatRaster to epsg:+proj=longlat +datum=WGS84 +no_defs, it takes a while... ✔ [2024-01-04 17:12:29.558543] SAVE ==> results have been saved to: object$spa$rast$his$EVI
# Show dataset
dataset.dts.aliyun %>% show_dataset()
── The Summary of Microgeo Dataset ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
! object$mat: 6808 ASVs/genes and 1244 samples [need to be subsampled!] ℹ object$ant: 6808 ASVs/genes and 7 annotation levels (Kingdom, Phylum, Class, Order, Family, Genus, Species) ℹ object$met: 1244 samples and 2 variables (longitude, latitude) ℹ object$map: a SpatialPolygonsDataFrame with the CRS of '+proj=longlat +datum=WGS84 +no_defs' ℹ object$phy: a phylogenetic tree with 6808 tip labels ℹ object$env: 1244 samples and 10 variables
── The Summary of Biogeographic Traits ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
✔ object$spa: 23 historically numeric variables; 0 historically classification variables; 4 groups of future climate data
• To check the summary of dataset, Replace `object` with the variable name of your dataset • For example, if the variable name is `dataset.dts`you can run `head(dataset.dts$met)` to check the content of `met`
# Visualize the NDVI and EVI
options(repr.plot.width = 13.43 * 2, repr.plot.height = 7.9)
q13 <- plot_bmap(map = dataset.dts.aliyun$map) %>%
add_spatraster(spat.raster = dataset.dts.aliyun$spa$rast$his$NDVI) %>%
add_north_arrow() %>% add_scale_bar() %>% add_crs()
q14 <- plot_bmap(map = dataset.dts.aliyun$map) %>%
add_spatraster(spat.raster = dataset.dts.aliyun$spa$rast$his$EVI) %>%
add_north_arrow() %>% add_scale_bar() %>% add_crs()
cowplot::plot_grid(q13, q14, align = 'hv', ncol = 2, labels = c("a", "b"), label_size = 28) %>% suppressWarnings()
Collect the classification metrics from MODIS¶
# Show all avaliable classification metrics
show_modis_cla_metrics()
| measure | resolution | type | name | sds | unit | scale.factor | |
|---|---|---|---|---|---|---|---|
| <chr> | <chr> | <chr> | <chr> | <chr> | <chr> | <chr> | |
| 1 | LC_Type1 | Moderate | Combined | MCD12Q1 | LC_Type1 | no unit | NA |
| 2 | LC_Type2 | Moderate | Combined | MCD12Q1 | LC_Type2 | no unit | NA |
| 3 | LC_Type3 | Moderate | Combined | MCD12Q1 | LC_Type3 | no unit | NA |
| 4 | LC_Type4 | Moderate | Combined | MCD12Q1 | LC_Type4 | no unit | NA |
| 5 | LC_Type5 | Moderate | Combined | MCD12Q1 | LC_Type5 | no unit | NA |
| 6 | LC_Prop1 | Moderate | Combined | MCD12Q1 | LC_Prop1 | no unit | NA |
| 7 | LC_Prop2 | Moderate | Combined | MCD12Q1 | LC_Prop2 | no unit | NA |
| 8 | LC_Prop3 | Moderate | Combined | MCD12Q1 | LC_Prop3 | no unit | NA |
| 9 | LC_Prop1_Assessment | Moderate | Combined | MCD12Q1 | LC_Prop1_Assessment | no unit | NA |
| 10 | LC_Prop2_Assessment | Moderate | Combined | MCD12Q1 | LC_Prop2_Assessment | no unit | NA |
| 11 | LC_Prop3_Assessment | Moderate | Combined | MCD12Q1 | LC_Prop3_Assessment | no unit | NA |
# Download classification metrics from MODIS
# Please replace <username> and <passwd> with real value
dataset.dts.aliyun %<>% get_modis_cla_metrics(username = "username", password = "passwd",
measures = "LC_Type1", out.dir = "../dev/dat/rundata")
ℹ [2024-01-04 17:12:36.405507] INFO ==> preparing MODIS product list for searching... ℹ [2024-01-04 17:12:36.415375] INFO ==> searching avaliable MODIS products... ℹ [2024-01-04 17:12:36.424059] INFO ==> current product (1/1): MCD12Q1 (LC_Type1--> 2022-01-01 to 2022-12-31) ℹ [2024-01-04 17:12:37.297119] INFO ==> find 8 files with 0.09 GB in total... ℹ [2024-01-04 17:12:37.306967] INFO ==> downloading all avaliable MODIS products[skip if the file exists]... ℹ [2024-01-04 17:12:37.317382] INFO ==> preparing the PTVs (Product, Time, Version) for merging remote-sensing images... ℹ [2024-01-04 17:12:37.335075] INFO ==> converting hdf files to tif files... ℹ [2024-01-04 17:12:37.346099] INFO ==> current product (1/1): MCD12Q1 (convert 8 hdf files into 1 tif files using 1 threads) ℹ [2024-01-04 17:12:37.576444] INFO ==> collecting all merged image files... ℹ [2024-01-04 17:12:37.590373] INFO ==> current measure (1/1): LC_Type1_061 ℹ [2024-01-04 17:12:37.638847] INFO ==> reprojecting the CRS of SpatRaster to epsg:+proj=longlat +datum=WGS84 +no_defs, it takes a while... ✔ [2024-01-04 17:13:07.54895] SAVE ==> results have been saved to: object$spa$rast$cla$LC_Type1
# Show dataset
dataset.dts.aliyun %>% show_dataset()
── The Summary of Microgeo Dataset ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
! object$mat: 6808 ASVs/genes and 1244 samples [need to be subsampled!] ℹ object$ant: 6808 ASVs/genes and 7 annotation levels (Kingdom, Phylum, Class, Order, Family, Genus, Species) ℹ object$met: 1244 samples and 2 variables (longitude, latitude) ℹ object$map: a SpatialPolygonsDataFrame with the CRS of '+proj=longlat +datum=WGS84 +no_defs' ℹ object$phy: a phylogenetic tree with 6808 tip labels ℹ object$env: 1244 samples and 10 variables
── The Summary of Biogeographic Traits ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
✔ object$spa: 23 historically numeric variables; 1 historically classification variables; 4 groups of future climate data
• To check the summary of dataset, Replace `object` with the variable name of your dataset • For example, if the variable name is `dataset.dts`you can run `head(dataset.dts$met)` to check the content of `met`
# We need to subset classification data before the visualization. Otherwise, it may take a very long time.
# Due to the high data resolution, it is not recommended to use `add_crs()` for visualization here. Otherwise, it may take a very long time.
options(repr.plot.width = 16.4, repr.plot.height = 8.02)
g.spat.raster <- subset_cla_spatraster(spat.raster = dataset.dts.aliyun$spa$rast$cla$LC_Type1, use.class = c(10, 16)) # Only display grassland and barren
plot_bmap(map = dataset.dts.aliyun$map) %>% add_spatraster(spat.raster = g.spat.raster, color = c("darkgreen", "brown"))
Collect the world soil properties¶
The get_soilgrid() is implemented to download soil metrics from SoilGRIDS In the current version, there are 8 avaliable metrics. Here is a simple example.
# Download soil metrics from the SoilGRIDS for the research region
dataset.dts.aliyun %<>% get_soilgrid(depth = 5, measures = c("phh2o", "soc"), out.dir = "../dev/dat/rundata")
✔ [2024-01-04 17:13:23.655202] SAVE ==> results have been saved to: object$spa$rast$his$phh2o ✔ [2024-01-04 17:13:31.002808] SAVE ==> results have been saved to: object$spa$rast$his$soc
# Show dataset
dataset.dts.aliyun %>% show_dataset()
── The Summary of Microgeo Dataset ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
! object$mat: 6808 ASVs/genes and 1244 samples [need to be subsampled!] ℹ object$ant: 6808 ASVs/genes and 7 annotation levels (Kingdom, Phylum, Class, Order, Family, Genus, Species) ℹ object$met: 1244 samples and 2 variables (longitude, latitude) ℹ object$map: a SpatialPolygonsDataFrame with the CRS of '+proj=longlat +datum=WGS84 +no_defs' ℹ object$phy: a phylogenetic tree with 6808 tip labels ℹ object$env: 1244 samples and 10 variables
── The Summary of Biogeographic Traits ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
✔ object$spa: 25 historically numeric variables; 1 historically classification variables; 4 groups of future climate data
• To check the summary of dataset, Replace `object` with the variable name of your dataset • For example, if the variable name is `dataset.dts`you can run `head(dataset.dts$met)` to check the content of `met`
# Visualize the phh2o and soc
options(repr.plot.width = 13.43 * 2, repr.plot.height = 7.9)
q15 <- plot_bmap(map = dataset.dts.aliyun$map) %>%
add_spatraster(spat.raster = dataset.dts.aliyun$spa$rast$his$phh2o) %>%
add_north_arrow() %>% add_scale_bar() %>% add_crs()
q16 <- plot_bmap(map = dataset.dts.aliyun$map) %>%
add_spatraster(spat.raster = dataset.dts.aliyun$spa$rast$his$soc) %>%
add_north_arrow() %>% add_scale_bar() %>% add_crs()
cowplot::plot_grid(q15, q16, align = 'hv', ncol = 2, labels = c("a", "b"), label_size = 28) %>% suppressWarnings()
Collect the soil properties of China¶
This function of get_soilcn() is used to process soil metrics downloaded from here. Due to the limitations in copyrights, the spatial data of China soil properties should be manually downloaded. Here is a simple example.
# Process soil metrics of CHINA for a research area
# You should download the .nc files, and then place theme into `../dev/dat/rundata/soilchina_products` before run `get_soilcn()`
dataset.dts.aliyun %<>% get_soilcn(depth = 0.045, measures = c("PH", "SOM"), out.dir = "../dev/dat/rundata")
✔ [2024-01-04 17:13:41.836801] SAVE ==> results have been saved to: object$spa$rast$his$PH ✔ [2024-01-04 17:13:49.22903] SAVE ==> results have been saved to: object$spa$rast$his$SOM
# Show dataset
dataset.dts.aliyun %>% show_dataset()
── The Summary of Microgeo Dataset ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
! object$mat: 6808 ASVs/genes and 1244 samples [need to be subsampled!] ℹ object$ant: 6808 ASVs/genes and 7 annotation levels (Kingdom, Phylum, Class, Order, Family, Genus, Species) ℹ object$met: 1244 samples and 2 variables (longitude, latitude) ℹ object$map: a SpatialPolygonsDataFrame with the CRS of '+proj=longlat +datum=WGS84 +no_defs' ℹ object$phy: a phylogenetic tree with 6808 tip labels ℹ object$env: 1244 samples and 10 variables
── The Summary of Biogeographic Traits ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
✔ object$spa: 27 historically numeric variables; 1 historically classification variables; 4 groups of future climate data
• To check the summary of dataset, Replace `object` with the variable name of your dataset • For example, if the variable name is `dataset.dts`you can run `head(dataset.dts$met)` to check the content of `met`
# Visualize the PH and SOM
options(repr.plot.width = 13.43 * 2, repr.plot.height = 7.9)
q17 <- plot_bmap(map = dataset.dts.aliyun$map) %>%
add_spatraster(spat.raster = dataset.dts.aliyun$spa$rast$his$PH) %>%
add_north_arrow() %>% add_scale_bar() %>% add_crs()
q18 <- plot_bmap(map = dataset.dts.aliyun$map) %>%
add_spatraster(spat.raster = dataset.dts.aliyun$spa$rast$his$SOM) %>%
add_north_arrow() %>% add_scale_bar() %>% add_crs()
cowplot::plot_grid(q17, q18, align = 'hv', ncol = 2, labels = c("a", "b"), label_size = 28) %>% suppressWarnings()
Extract spatial data for each sample¶
After all spatial data have been successfully downloaded, we can use the extract_data_from_spatraster() function to extract these spatial data based on the latitude and longitude of the samples. Here is a simple example.
# Show spatial variable names before the extraction
dataset.dts.aliyun %<>% get_spa_vars()
dataset.dts.aliyun$spa$unit
| measure | unit | |
|---|---|---|
| <chr> | <chr> | |
| 1 | ELEV | m |
| 2 | AI | |
| 3 | Bio1 | degree centigrade |
| 4 | Bio2 | degree centigrade |
| 5 | Bio3 | degree centigrade |
| 6 | Bio4 | degree centigrade |
| 7 | Bio5 | degree centigrade |
| 8 | Bio6 | degree centigrade |
| 9 | Bio7 | degree centigrade |
| 10 | Bio8 | degree centigrade |
| 11 | Bio9 | degree centigrade |
| 12 | Bio10 | degree centigrade |
| 13 | Bio11 | degree centigrade |
| 14 | Bio12 | mm |
| 15 | Bio13 | mm |
| 16 | Bio14 | mm |
| 17 | Bio15 | mm |
| 18 | Bio16 | mm |
| 19 | Bio17 | mm |
| 20 | Bio18 | mm |
| 21 | Bio19 | mm |
| 22 | NDVI | |
| 23 | EVI | |
| 24 | PH | |
| 25 | SOM | g/100g |
| 26 | phh2o | |
| 27 | soc | g kg^-1 |
| 28 | LC_Type1 |
# Extract spatial data for samples
# A `data.frame` of historically spatial variables for each sample is avaliable at `dataset.dts.aliyun$spa$tabs`
dataset.dts.aliyun %<>% extract_data_from_spatraster() %>% suppressWarnings()
head(dataset.dts.aliyun$spa$tabs)
! [2024-01-04 17:13:55.036938] WARN ==> Some samples were failed to be applied for extraction. use `remove.na = FALSE` to check them! ✔ [2024-01-04 17:13:55.048203] SAVE ==> results have been saved to: object$spa$tabs
| LC_Type1 | AI | ELEV | Bio1 | Bio2 | Bio3 | Bio4 | Bio5 | Bio6 | Bio7 | ⋯ | Bio16 | Bio17 | Bio18 | Bio19 | NDVI | EVI | phh2o | soc | PH | SOM | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| <fct> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | ⋯ | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | |
| s1 | 10 | 0.4763 | 4077.8 | 0.05243338 | 14.9598 | 41.23052 | 776.2708 | 15.9224 | -20.3608 | 36.2832 | ⋯ | 298.4 | 14 | 298.4 | 15.4 | 0.7096171 | 0.5045436 | 7.0 | 58.2 | 7.781 | 7.354 |
| s2 | 10 | 0.4763 | 4077.8 | 0.05243338 | 14.9598 | 41.23052 | 776.2708 | 15.9224 | -20.3608 | 36.2832 | ⋯ | 298.4 | 14 | 298.4 | 15.4 | 0.7096171 | 0.5045436 | 7.0 | 58.2 | 7.781 | 7.354 |
| s3 | 10 | 0.4763 | 4077.8 | 0.05243338 | 14.9598 | 41.23052 | 776.2708 | 15.9224 | -20.3608 | 36.2832 | ⋯ | 298.4 | 14 | 298.4 | 15.4 | 0.7096171 | 0.5045436 | 7.0 | 58.2 | 7.781 | 7.354 |
| s4 | 10 | 0.4763 | 4077.8 | 0.05243338 | 14.9598 | 41.23052 | 776.2708 | 15.9224 | -20.3608 | 36.2832 | ⋯ | 298.4 | 14 | 298.4 | 15.4 | 0.7096171 | 0.5045436 | 7.0 | 58.2 | 7.781 | 7.354 |
| s5 | 10 | 0.4763 | 4077.8 | 0.05243338 | 14.9598 | 41.23052 | 776.2708 | 15.9224 | -20.3608 | 36.2832 | ⋯ | 298.4 | 14 | 298.4 | 15.4 | 0.7096171 | 0.5045436 | 7.0 | 58.2 | 7.781 | 7.354 |
| s6 | 10 | 0.4874 | 4103.2 | -0.04426666 | 14.9932 | 41.31165 | 775.4506 | 15.8096 | -20.4832 | 36.2928 | ⋯ | 299.6 | 14 | 299.6 | 15.6 | 0.7303110 | 0.5289111 | 6.9 | 63.5 | 7.781 | 7.354 |
# Check sample numbers
dim(dataset.dts.aliyun$spa$tabs)
- 1095
- 28
# Finally, we tidy up the dataset again
dataset.dts.aliyun %<>% rarefy_count_table()
dataset.dts.aliyun %<>% tidy_dataset()
dataset.dts.aliyun %>% show_dataset()
ℹ [2024-01-04 17:13:57.773302] INFO ==> the ASV/gene abundance table has been rarefied with a sub-sample depth of 5310
── The Summary of Microgeo Dataset ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
ℹ object$mat: 6807 ASVs/genes and 1095 samples [subsample depth: 5310] ℹ object$ant: 6807 ASVs/genes and 7 annotation levels (Kingdom, Phylum, Class, Order, Family, Genus, Species) ℹ object$met: 1095 samples and 2 variables (longitude, latitude) ℹ object$map: a SpatialPolygonsDataFrame with the CRS of '+proj=longlat +datum=WGS84 +no_defs' ℹ object$phy: a phylogenetic tree with 6807 tip labels ℹ object$env: 1095 samples and 10 variables
── The Summary of Biogeographic Traits ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
✔ object$spa: 27 historically numeric variables; 1 historically classification variables; 4 groups of future climate data
• To check the summary of dataset, Replace `object` with the variable name of your dataset • For example, if the variable name is `dataset.dts`you can run `head(dataset.dts$met)` to check the content of `met`